27 research outputs found
Complex Embeddings for Simple Link Prediction
In statistical relational learning, the link prediction problem is key to
automatically understanding the structure of large knowledge bases. As in
previous studies, we propose to solve this problem through latent
factorization. However, here we make use of complex-valued embeddings. The composition of
complex embeddings can handle a large variety of binary relations, among them
symmetric and antisymmetric relations. Compared to state-of-the-art models such
as Neural Tensor Network and Holographic Embeddings, our approach based on
complex embeddings is arguably simpler, as it only uses the Hermitian dot
product, the complex counterpart of the standard dot product between real
vectors. Our approach is scalable to large datasets as it remains linear in
both space and time, while consistently outperforming alternative approaches on
standard link prediction benchmarks.
Comment: 10+2 pages, accepted at ICML 2016
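The scoring function this abstract describes (the real part of a trilinear Hermitian product) is compact enough to illustrate directly. The following is a minimal NumPy sketch, not the authors' code; all names are ours. It shows why a purely real relation embedding scores symmetrically while a purely imaginary one scores antisymmetrically:

```python
import numpy as np

def complex_score(rel, subj, obj):
    """ComplEx-style score: Re(<w_r, e_s, conj(e_o)>), the real part of the
    trilinear Hermitian product. Linear in the embedding dimension."""
    return np.real(np.sum(rel * subj * np.conj(obj)))

rng = np.random.default_rng(0)
dim = 4
e_s = rng.normal(size=dim) + 1j * rng.normal(size=dim)
e_o = rng.normal(size=dim) + 1j * rng.normal(size=dim)

# A purely real relation embedding yields a symmetric score...
w_sym = rng.normal(size=dim) + 0j
assert np.isclose(complex_score(w_sym, e_s, e_o), complex_score(w_sym, e_o, e_s))

# ...while a purely imaginary one yields an antisymmetric score.
w_anti = 1j * rng.normal(size=dim)
assert np.isclose(complex_score(w_anti, e_s, e_o), -complex_score(w_anti, e_o, e_s))
```

The asymmetry comes entirely from conjugating the object embedding, which is what lets a simple dot-product model capture antisymmetric relations.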
NagE: Non-Abelian Group Embedding for Knowledge Graphs
We demonstrated the existence of a group algebraic structure hidden in
relational knowledge embedding problems, which suggests that a group-based
embedding framework is essential for designing embedding models. Our
theoretical analysis explores only the intrinsic properties of the embedding
problem itself and is hence model-independent. Motivated by the theoretical
analysis, we have proposed a group theory-based knowledge graph embedding
framework, in which relations are embedded as group elements, and entities are
represented by vectors in group action spaces. We provide a generic recipe to
construct embedding models associated with two instantiated examples: SO3E and
SU2E, both of which use a continuous non-Abelian group as the relation
embedding. Empirical experiments with these two example models show
state-of-the-art results on benchmark datasets.
Comment: work accepted at the 29th ACM International Conference on Information
and Knowledge Management
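The abstract's key idea, relations as group elements acting on entity vectors, can be made concrete with SO(3), the rotation group used by SO3E. A minimal NumPy sketch (our own illustration via the Rodrigues formula, not the paper's code) showing the non-Abelian composition the framework relies on:

```python
import numpy as np

def rot(axis, theta):
    """Rotation matrix in SO(3) about a unit axis by angle theta
    (Rodrigues' rotation formula)."""
    axis = np.asarray(axis, float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

Rx = rot([1, 0, 0], np.pi / 2)   # one relation, embedded as a rotation
Rz = rot([0, 0, 1], np.pi / 2)   # another relation

# Group action on an entity vector: tail = R @ head.
head = np.array([1.0, 2.0, 3.0])
tail = Rx @ head

# Non-Abelian: composing the two relations in different orders differs.
assert not np.allclose(Rx @ Rz, Rz @ Rx)
# Each matrix is a genuine SO(3) element: orthogonal with determinant +1.
assert np.allclose(Rx @ Rx.T, np.eye(3))
assert np.isclose(np.linalg.det(Rx), 1.0)
```

Non-commutativity matters because chains of relations in a knowledge graph generally compose in an order-dependent way, which Abelian embeddings cannot express.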
What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization
Knowledge graphs (KGs) store highly heterogeneous information about the world
in the structure of a graph, and are useful for tasks such as question
answering and reasoning. However, they often contain errors and are missing
information. Vibrant research in KG refinement has worked to resolve these
issues, tailoring techniques to either detect specific types of errors or
complete a KG.
In this work, we introduce a unified solution to KG characterization by
formulating the problem as unsupervised KG summarization with a set of
inductive, soft rules, which describe what is normal in a KG, and thus can be
used to identify what is abnormal, whether it be strange or missing. Unlike
first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns
that describe the expected neighborhood around a (seen or unseen) node, based
on its type, and information in the KG. Stepping away from the traditional
support/confidence-based rule mining techniques, we propose KGist, Knowledge
Graph Inductive SummarizaTion, which learns a summary of inductive rules that
best compress the KG according to the Minimum Description Length principle---a
formulation that we are the first to use in the context of KG rule mining. We
apply our rules to three large KGs (NELL, DBpedia, and Yago), and tasks such as
compression, various types of error detection, and identification of incomplete
information. We show that KGist outperforms task-specific, supervised and
unsupervised baselines in error detection and incompleteness identification
(identifying the location of up to 93% of missing entities---over 10% more than
baselines), while also being efficient for large knowledge graphs.
Comment: 10 pages, plus 2 pages of references. 5 figures. Accepted at The Web
Conference 2020
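The Minimum Description Length formulation above can be illustrated with a toy two-part cost. This is our own simplified sketch, not KGist's actual encoding of rules and exceptions: a rule is worth keeping only when the bits it costs to state are outweighed by the bits saved on the edges it explains.

```python
import math

def description_length(num_edges_total, edges_explained, rule_cost_bits):
    """Toy two-part MDL cost: bits to encode the model (the rule) plus bits
    to encode the data given the model. Here each unexplained edge costs a
    flat log2(num_edges_total) bits; KGist's real encoding is richer."""
    unexplained = num_edges_total - edges_explained
    return rule_cost_bits + unexplained * math.log2(num_edges_total)

# Baseline: no rules, every one of 1024 edges encoded verbatim (10 bits each).
baseline = description_length(1024, 0, 0.0)
# A rule costing 64 bits that explains 900 edges compresses the graph.
with_rule = description_length(1024, 900, 64.0)
assert with_rule < baseline
```

Under this objective, "abnormal" structure is simply whatever the chosen rules fail to compress, which is how one summary serves both error detection and incompleteness identification.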
Correcting Knowledge Base Assertions
The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.
Product Knowledge Graph Embedding for E-commerce
In this paper, we propose a new product knowledge graph (PKG) embedding
approach for learning the intrinsic product relations as product knowledge for
e-commerce. We define the key entities and summarize the pivotal product
relations that are critical for general e-commerce applications including
marketing, advertisement, search ranking and recommendation. We first provide a
comprehensive comparison between PKG and ordinary knowledge graph (KG) and then
illustrate why KG embedding methods are not suitable for PKG learning. We
construct a self-attention-enhanced distributed representation learning model
for learning PKG embeddings from raw customer activity data in an end-to-end
fashion. We design an effective multi-task learning schema to fully leverage
the multi-modal e-commerce data. The Poincaré embedding is also employed to
handle complex entity structures. We use a real-world dataset from
grocery.walmart.com to evaluate performance on knowledge completion,
search ranking, and recommendation. The proposed approach compares favourably to
baselines in knowledge completion and downstream tasks.
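The Poincaré embedding mentioned above places points in the open unit ball, where hyperbolic distance grows rapidly near the boundary, giving hierarchical entity structures room to spread out. A minimal sketch of the standard distance function (our own illustration, not the paper's code):

```python
import numpy as np

def poincare_distance(u, v):
    """Hyperbolic distance in the Poincare ball:
    arcosh(1 + 2*||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    Both points must lie strictly inside the unit ball."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    sq_dist = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq_dist / denom)

origin = np.zeros(2)
near = np.array([0.1, 0.0])
boundary = np.array([0.99, 0.0])

# Distances blow up near the boundary of the ball, so deep levels of a
# hierarchy can be embedded cheaply in low dimension.
assert poincare_distance(origin, boundary) > poincare_distance(origin, near)
```

Intuitively, the root of a product taxonomy sits near the origin while fine-grained leaf entities live near the boundary, where there is exponentially more "room".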
TransGCN: Coupling Transformation Assumptions with Graph Convolutional Networks for Link Prediction
Link prediction is an important and frequently studied task that contributes
to an understanding of the structure of knowledge graphs (KGs) in statistical
relational learning. Inspired by the success of graph convolutional networks
(GCN) in modeling graph data, we propose a unified GCN framework, named
TransGCN, to address this task, in which relation and entity embeddings are
learned simultaneously. To handle heterogeneous relations in KGs, we introduce
a novel way of representing a heterogeneous neighborhood by imposing
transformation assumptions on the relationship between the subject, the
relation, and the object of a triple. Specifically, a relation is treated as a
transformation operator transforming a head entity into a tail entity. Both
the translation assumption of TransE and the rotation assumption of RotatE are
explored in our framework. Additionally, instead of only learning entity embeddings in
the convolution-based encoder while learning relation embeddings in the decoder
as done by state-of-the-art models, e.g., R-GCN, the TransGCN framework trains
relation embeddings and entity embeddings simultaneously during the graph
convolution operation, thus having fewer parameters compared with R-GCN.
Experiments show that our models outperform state-of-the-art methods on both
FB15K-237 and WN18RR.
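The two transformation assumptions named above are simple to state in code. A minimal NumPy sketch (our own illustration, not the paper's implementation): TransE treats a relation as a translation vector, while RotatE treats it as an elementwise rotation in the complex plane, which preserves the modulus of each coordinate.

```python
import numpy as np

def transe_transform(head, rel):
    """TransE assumption: the relation is a translation, tail = head + rel."""
    return head + rel

def rotate_transform(head, phase):
    """RotatE assumption: the relation is an elementwise rotation in the
    complex plane, tail = head * exp(i * phase), with |exp(i * phase)| = 1."""
    return head * np.exp(1j * phase)

head = np.array([1.0 + 0.0j, 0.0 + 1.0j])
phase = np.array([np.pi / 2, np.pi])

# Rotation preserves the modulus of every coordinate...
tail = rotate_transform(head, phase)
assert np.allclose(np.abs(tail), np.abs(head))

# ...while translation generally does not.
shifted = transe_transform(head.real, np.ones(2))
assert not np.allclose(np.abs(shifted), np.abs(head.real))
```

In TransGCN either transformation can serve as the operator that maps a neighboring head entity toward the tail during the graph convolution, which is what lets relation and entity embeddings be trained together in the encoder.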